New statistical method for machine-printed Arabic character recognition
Internal identifier: 001387 (Main/Exploration); previous: 001386; next: 001388
Authors: HUA WANG [People's Republic of China]; XIAOQING DING [People's Republic of China]; JIANMING JIN [People's Republic of China]; HALMURAT [People's Republic of China]
Source:
- SPIE proceedings series [1017-2653]; 2005.
French descriptors
- Pascal (Inist)
- Méthode statistique, Caractère imprimé, Arabe, Reconnaissance caractère, Reconnaissance optique caractère, Chinois, Jeu caractère, Extraction caractéristique, Fonction quadratique, Fonction discriminante, Classification automatique, Appareillage essai, Reconnaissance forme, Traitement signal, Classification signal.
- Wicri :
- topic : Méthode statistique.
English descriptors
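- KwdEn :
- Arabic, Automatic classification, Character recognition, Character set, Chinese, Discriminant function, Feature extraction, Optical character recognition, Pattern recognition, Printed character, Quadratic function, Signal classification, Signal processing, Statistical method, Testing equipment.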
Abstract
Although about 300 million people worldwide, across several languages, write with Arabic characters, Arabic OCR has not been researched as thoroughly as OCR for other widely used scripts (Latin or Chinese). This paper develops a new statistical method to recognize machine-printed Arabic characters. First, the entire Arabic character set is pre-classified into 32 subsets according to character form, the special zones that characters occupy, and component information. Directional features are then extracted and classified with a modified quadratic discriminant function (MQDF). Finally, similar characters are discriminated before the recognition results are output. Encouraging experimental results on test sets show the validity of the proposed method.
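The abstract names MQDF as the classifier but gives no formula or parameters. As a rough illustration only, the sketch below implements the commonly cited MQDF form: each class keeps its k leading covariance eigenvectors, and the remaining eigenvalues are replaced by one constant. The function names (`fit_mqdf`, `mqdf_distance`, `predict`), the choice of that constant as the mean of the discarded eigenvalues, and the synthetic data are all assumptions; the paper's pre-classification and directional feature extraction are not reproduced here.

```python
import numpy as np


def fit_mqdf(X, y, k, eps=1e-8):
    """Estimate per-class MQDF parameters (hypothetical helper).

    X: (n_samples, d) feature matrix, y: class labels, k: number of
    principal eigenvectors kept per class (requires k < d).
    """
    models = {}
    for label in np.unique(y):
        Xc = X[y == label]
        mean = Xc.mean(axis=0)
        cov = np.cov(Xc, rowvar=False)
        eigvals, eigvecs = np.linalg.eigh(cov)   # ascending order
        order = np.argsort(eigvals)[::-1]        # re-sort descending
        eigvals, eigvecs = eigvals[order], eigvecs[:, order]
        # Assumption: minor eigenvalues are replaced by their mean.
        delta = max(eigvals[k:].mean(), eps)
        major = np.maximum(eigvals[:k], eps)
        models[label] = (mean, major, eigvecs[:, :k], delta)
    return models


def mqdf_distance(x, model):
    """MQDF score of x for one class; smaller means a better match."""
    mean, lam, phi, delta = model
    diff = x - mean
    proj = phi.T @ diff                  # coordinates in the major subspace
    residual = diff @ diff - proj @ proj  # energy outside the subspace
    d, k = diff.shape[0], lam.shape[0]
    return (
        np.sum(proj ** 2 / lam)
        + residual / delta
        + np.sum(np.log(lam))
        + (d - k) * np.log(delta)
    )


def predict(x, models):
    """Return the label whose MQDF score is smallest."""
    return min(models, key=lambda label: mqdf_distance(x, models[label]))


if __name__ == "__main__":
    # Synthetic stand-in for directional features of two character classes.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 1.0, (50, 4)),
                   rng.normal(5.0, 1.0, (50, 4))])
    y = np.array([0] * 50 + [1] * 50)
    models = fit_mqdf(X, y, k=2)
    print(predict(np.full(4, 5.0), models))
```

Keeping only k eigenvectors per class reduces storage and the number of poorly estimated small eigenvalues, which is the usual motivation for MQDF over a full quadratic discriminant.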
Affiliations:
Links toward previous steps (curation, corpus...)
- to stream PascalFrancis, to step Corpus: 000462
- to stream PascalFrancis, to step Curation: 000326
- to stream PascalFrancis, to step Checkpoint: 000395
- to stream Main, to step Merge: 001425
- to stream Main, to step Curation: 001387
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">New statistical method for machine-printed Arabic character recognition</title>
<author><name sortKey="Hua Wang" sort="Hua Wang" uniqKey="Hua Wang" last="Hua Wang">HUA WANG</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Image Division, Dept. of Electronic Engineering, Tsinghua Univ</s1>
<s2>Beijing 100084</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName><settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Xiaoqing Ding" sort="Xiaoqing Ding" uniqKey="Xiaoqing Ding" last="Xiaoqing Ding">XIAOQING DING</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Image Division, Dept. of Electronic Engineering, Tsinghua Univ</s1>
<s2>Beijing 100084</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName><settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Jianming Jin" sort="Jianming Jin" uniqKey="Jianming Jin" last="Jianming Jin">JIANMING JIN</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Image Division, Dept. of Electronic Engineering, Tsinghua Univ</s1>
<s2>Beijing 100084</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName><settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Halmurat" sort="Halmurat" uniqKey="Halmurat" last="Halmurat">HALMURAT</name>
<affiliation wicri:level="1"><inist:fA14 i1="02"><s1>School of Information, Xinjiang Univ</s1>
<s2>Urumqi 830046</s2>
<s3>CHN</s3>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<wicri:noRegion>Urumqi 830046</wicri:noRegion>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">05-0360134</idno>
<date when="2005">2005</date>
<idno type="stanalyst">PASCAL 05-0360134 INIST</idno>
<idno type="RBID">Pascal:05-0360134</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000462</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000326</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000395</idno>
<idno type="wicri:doubleKey">1017-2653:2005:Hua Wang:new:statistical:method</idno>
<idno type="wicri:Area/Main/Merge">001425</idno>
<idno type="wicri:Area/Main/Curation">001387</idno>
<idno type="wicri:Area/Main/Exploration">001387</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">New statistical method for machine-printed Arabic character recognition</title>
<author><name sortKey="Hua Wang" sort="Hua Wang" uniqKey="Hua Wang" last="Hua Wang">HUA WANG</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Image Division, Dept. of Electronic Engineering, Tsinghua Univ</s1>
<s2>Beijing 100084</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName><settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Xiaoqing Ding" sort="Xiaoqing Ding" uniqKey="Xiaoqing Ding" last="Xiaoqing Ding">XIAOQING DING</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Image Division, Dept. of Electronic Engineering, Tsinghua Univ</s1>
<s2>Beijing 100084</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName><settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Jianming Jin" sort="Jianming Jin" uniqKey="Jianming Jin" last="Jianming Jin">JIANMING JIN</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Image Division, Dept. of Electronic Engineering, Tsinghua Univ</s1>
<s2>Beijing 100084</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName><settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Halmurat" sort="Halmurat" uniqKey="Halmurat" last="Halmurat">HALMURAT</name>
<affiliation wicri:level="1"><inist:fA14 i1="02"><s1>School of Information, Xinjiang Univ</s1>
<s2>Urumqi 830046</s2>
<s3>CHN</s3>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<wicri:noRegion>Urumqi 830046</wicri:noRegion>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">SPIE proceedings series</title>
<idno type="ISSN">1017-2653</idno>
<imprint><date when="2005">2005</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">SPIE proceedings series</title>
<idno type="ISSN">1017-2653</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Arabic</term>
<term>Automatic classification</term>
<term>Character recognition</term>
<term>Character set</term>
<term>Chinese</term>
<term>Discriminant function</term>
<term>Feature extraction</term>
<term>Optical character recognition</term>
<term>Pattern recognition</term>
<term>Printed character</term>
<term>Quadratic function</term>
<term>Signal classification</term>
<term>Signal processing</term>
<term>Statistical method</term>
<term>Testing equipment</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Méthode statistique</term>
<term>Caractère imprimé</term>
<term>Arabe</term>
<term>Reconnaissance caractère</term>
<term>Reconnaissance optique caractère</term>
<term>Chinois</term>
<term>Jeu caractère</term>
<term>Extraction caractéristique</term>
<term>Fonction quadratique</term>
<term>Fonction discriminante</term>
<term>Classification automatique</term>
<term>Appareillage essai</term>
<term>Reconnaissance forme</term>
<term>Traitement signal</term>
<term>Classification signal</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr"><term>Méthode statistique</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Although about 300 million people worldwide, in several different languages, take Arabic characters for writing, Arabic OCR has not been researched as thoroughly as other widely used characters (Latin or Chinese). In this paper, a new statistical method is developed to recognize machine-printed Arabic characters. Firstly, the entire Arabic character set is pre-classified into 32 sub-sets in terms of character forms, special zones that characters occupy and component information. Then directional features are extracted based on which modified quadratic discriminant function (MQDF) is utilized as classifier to deal with classification task. Finally, similar characters are discriminated before outputting recognition results. Encouraging experimental results on test sets show the validity of proposed method.</div>
</front>
</TEI>
<affiliations><list><country><li>République populaire de Chine</li>
</country>
<settlement><li>Pékin</li>
</settlement>
</list>
<tree><country name="République populaire de Chine"><noRegion><name sortKey="Hua Wang" sort="Hua Wang" uniqKey="Hua Wang" last="Hua Wang">HUA WANG</name>
</noRegion>
<name sortKey="Halmurat" sort="Halmurat" uniqKey="Halmurat" last="Halmurat">HALMURAT</name>
<name sortKey="Jianming Jin" sort="Jianming Jin" uniqKey="Jianming Jin" last="Jianming Jin">JIANMING JIN</name>
<name sortKey="Xiaoqing Ding" sort="Xiaoqing Ding" uniqKey="Xiaoqing Ding" last="Xiaoqing Ding">XIAOQING DING</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001387 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001387 | SxmlIndent | more
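Once a record has been extracted, it may be convenient to read it programmatically. The record shown above uses the `wicri:` and `inist:` prefixes without declaring them, so a namespace-aware XML parser would be expected to reject it as written. The sketch below works around this by binding both prefixes to placeholder URIs before parsing. The URIs, the function names (`parse_record`, `summarize`), and the trimmed `SAMPLE` string are assumptions made for illustration and are not part of Dilib.

```python
import re
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumption): the record does not declare them.
PLACEHOLDER_NS = {
    "wicri": "urn:example:wicri",
    "inist": "urn:example:inist",
}

# Trimmed excerpt of the record above, used only to exercise the functions.
SAMPLE = """<record><TEI><teiHeader><fileDesc><titleStmt>
<title xml:lang="en" level="a">New statistical method for machine-printed Arabic character recognition</title>
<author><name sortKey="Hua Wang">HUA WANG</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s3>CHN</s3></inist:fA14></affiliation>
</author>
</titleStmt></fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Arabic</term>
<term>Character recognition</term>
</keywords></textClass></profileDesc>
</teiHeader></TEI></record>"""


def parse_record(xml_text):
    """Parse a <record> whose wicri:/inist: prefixes are undeclared."""
    decls = " ".join(
        f'xmlns:{prefix}="{uri}"' for prefix, uri in PLACEHOLDER_NS.items()
    )
    # Add the declarations to the first <record> start tag only.
    patched = re.sub(r"<record\b", f"<record {decls}", xml_text, count=1)
    return ET.fromstring(patched)


def summarize(record):
    """Return the title, author names, and English keywords of a record."""
    title = record.findtext(".//titleStmt/title")
    authors = []
    for name in record.iterfind(".//titleStmt/author/name"):
        if name.text not in authors:  # keep first occurrence, preserve order
            authors.append(name.text)
    keywords = [
        term.text
        for group in record.iterfind(".//keywords")
        if group.get("scheme") == "KwdEn"
        for term in group.iterfind("term")
    ]
    return {"title": title, "authors": authors, "keywords": keywords}


if __name__ == "__main__":
    print(summarize(parse_record(SAMPLE)))
```

Reading authors from `titleStmt` only avoids counting each author twice, since the record repeats the author list under `sourceDesc`.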
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= Pascal:05-0360134 |texte= New statistical method for machine-printed Arabic character recognition }}
This area was generated with Dilib version V0.6.32.